{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "EtdtP8_cD8-G"
   },
   "source": [
    "# Custom Layers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "r50jnbKrIMeZ"
   },
   "source": [
    "One of the reasons for the success of deep learning can be found in the wide range of layers that can\n",
    "be used in a deep network. This allows for a tremendous degree of customization and adaptation. For\n",
    "instance, scientists have invented layers for images, text, pooling, loops, dynamic programming, even for\n",
    "computer programs. Sooner or later you will encounter a layer that doesn’t exist yet in Torch, or even\n",
    "better, you will eventually invent a new layer that works well for your problem at hand. This is when it’s\n",
    "time to build a custom layer. This section shows you how."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "5F8SRjlqEh2-"
   },
   "source": [
    "Layers without Parameters\n",
    "Since this is slightly intricate, we start with a custom layer (aka Module) that doesn’t have any inherent parameters. Our first step is very similar to when we introduced modules previously. The following\n",
    "CenteredLayer class constructs a layer that subtracts the mean from the input. We build it by inheriting from the Module class and implementing the forward method.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "8RnC8rUyEpvF"
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "from torch import nn\n",
    "from torch.autograd import Variable\n",
    "import torch.nn.functional as F"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "class CenteredLayer(nn.Module):\n",
    "  def __init__(self):\n",
    "      super().__init__()\n",
    "  def forward(self, x):\n",
    "    return x - x.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "XV7HjPJ2HcTA"
   },
   "source": [
    "To see how it works let’s feed some data into the layer.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 33
    },
    "colab_type": "code",
    "id": "mnPzfKzxGf3T",
    "outputId": "c0e6cdfe-67d9-4bd4-b2dd-9c300f07860a"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([-2., -1.,  0.,  1.,  2.])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "layer = CenteredLayer()\n",
    "layer(torch.FloatTensor([1, 2, 3, 4, 5]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "r7b2DC_sGiA5"
   },
   "source": [
    "We can also use it to construct more complex models.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "ndZS07wZGqcx"
   },
   "outputs": [],
   "source": [
    "net = nn.Sequential(nn.Linear(8,128), CenteredLayer())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "Q__zetKlMFuF"
   },
   "source": [
    "Let’s see whether the centering layer did its job. For that we send random data through the network and\n",
    "check whether the mean vanishes. Note that since we’re dealing with floating point numbers, we’re going\n",
    "to see a very small albeit typically nonzero number."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 33
    },
    "colab_type": "code",
    "id": "SBXVV0XKLKOm",
    "outputId": "32e5c382-1288-4944-b7b4-ec1e42279d11"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor(-2.7940e-09, grad_fn=<MeanBackward1>)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y=net(torch.rand(4,8))\n",
    "y.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "nfq8wa3aOavn"
   },
   "source": [
    "# Layers with Parameters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "RuBt45V8OhTu"
   },
   "source": [
    "Now that we know how to define layers in principle, let’s define layers with parameters. These can be adjusted through training. In order to simplify things for an avid deep learning researcher the Parameter\n",
    "class and the ParameterDict dictionary provide some basic housekeeping functionality. In particular,\n",
    "they govern access, initialization, sharing, saving and loading model parameters. For instance, this way\n",
    "we don’t need to write custom serialization routines for each new custom layer.\n",
    "For instance, we can use the member variable params of the ParameterDict type that comes with\n",
    "the Module class. It is a dictionary that maps string type parameter names to model parameters in the\n",
    "Parameter type. We can create a Parameter instance from ParameterDict via the get function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 33
    },
    "colab_type": "code",
    "id": "rutASqVeOg94",
    "outputId": "e83e69d0-3732-4a45-cb97-92169aa75097"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ParameterDict(  (param2): Parameter containing: [torch.FloatTensor of size 2x3])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "params=torch.nn.ParameterDict()\n",
    "params.update({\"param2\":nn.Parameter(Variable(torch.ones(2,3)))})\n",
    "params"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "2MslVi6oMLiX"
   },
   "source": [
    "Let’s use this to implement our own version of the dense layer. It has two parameters - bias and weight.\n",
    "To make it a bit nonstandard, we bake in the ReLU activation as default. Next, we implement a fully\n",
    "connected layer with both weight and bias parameters. It uses ReLU as an activation function, where\n",
    "in_units and units are the number of inputs and the number of outputs, respectively.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "Elm6quT-QOC6"
   },
   "outputs": [],
   "source": [
    "class MyDense(nn.Module):\n",
    "    def __init__(self, units, in_units):\n",
    "        super().__init__()\n",
    "        self.weight = Variable(torch.ones(in_units, units))\n",
    "        self.bias = Variable(torch.ones(units,))\n",
    "    def forward(self, x):\n",
    "        linear = torch.matmul(x, self.weight.data) + self.bias.data\n",
    "        return F.relu(linear)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "EVK0jlxRQubj"
   },
   "source": [
    "Naming the parameters allows us to access them by name through dictionary lookup later. It’s a good idea\n",
    "to give them instructive names. Next, we instantiate the MyDense class and access its model parameters.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 33
    },
    "colab_type": "code",
    "id": "QKZvUVYKRsst",
    "outputId": "bfad20b9-ddd4-455e-c84d-4a9a03d3c645"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<bound method Module.parameters of MyDense()>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dense = MyDense(units=3, in_units=5)\n",
    "dense.parameters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "9LCRANkkRxj3"
   },
   "source": [
    "We can directly carry out forward calculations using custom layers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 50
    },
    "colab_type": "code",
    "id": "EcxgO1p6R8tx",
    "outputId": "835ade9b-b9df-49d9-c0c4-4731b76af500"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([[3.1822, 3.1822, 3.1822],\n",
       "        [2.6808, 2.6808, 2.6808]])"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dense(torch.rand(2,5))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "KDNSxXBXR_1u"
   },
   "source": [
    "We can also construct models using custom layers. Once we have that we can use it just like the built-in\n",
    "dense layer. The only exception is that in our case size inference is not automagic. Please consult the\n",
    "PyTorch documentation for details on how to do this.,"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 50
    },
    "colab_type": "code",
    "id": "tYO-20Y9SCNC",
    "outputId": "46899257-9141-4bea-ab18-19f05d748b92"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([[233.8275],\n",
       "        [246.2336]])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "net = nn.Sequential(\n",
    "MyDense(8, in_units=64),\n",
    "MyDense(1, in_units=8))\n",
    "\n",
    "net(torch.rand(2,64))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "3zGA6m60SE27"
   },
   "source": [
    "# Summary\n",
    "\n",
    "• We can design custom layers via the Module class. This is more powerful than defining a module\n",
    "factory, since it can be invoked in many contexts.\n",
    "\n",
    "• Modules can have local parameters."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "ZbFfKDqvSLrU"
   },
   "source": [
    "# Exercises\n",
    "\n",
    "1. Design a layer that learns an affine transform of the data, i.e. it removes the mean and learns an\n",
    "additive parameter instead.\n",
    "\n",
    "2. Design a layer that takes an input and computes a tensor reduction, i.e. it returns\n",
    "∑\n",
    "yk =\n",
    "i,j Wijkxixj .\n",
    "\n",
    "3. Design a layer that returns the leading half of the Fourier coefficients of the data. Hint - look up\n",
    "the fft function in PyTorch.\n"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "name": "Custom Layers.ipynb",
   "provenance": [],
   "version": "0.3.2"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
